Abstract. Bounded model checking (BMC) is a well-known
and successful technique for finding bugs in software. k-induction
is an approach to extend BMC-based approaches from falsification
to verification. Automatically generated auxiliary invariants
can be used to strengthen the induction hypothesis. We improve
this approach and further increase effectiveness and efficiency
in the following way: we start with light-weight invariants and
refine these invariants continuously during the analysis. We
present and evaluate an implementation of our approach in
the open-source verification framework CPAchecker. Our experiments
show that combining k-induction with continuously-refined
invariants significantly increases effectiveness and efficiency, and
outperforms all existing implementations of k-induction-based
software verification in terms of successful verification results.

I. Introduction 

Advances in software verification in recent years have
led to increased efforts towards applying formal verification
methods to industrial software, in particular operating-systems
code [27]. One model-checking technique that is implemented
by more than half of the verifiers that participated in
the 2014 Competition on Software Verification is bounded
model checking (BMC) [13]. For unbounded systems, BMC
can be used only for falsification, not for verification.
This limitation to falsification can be overcome by combining
BMC with mathematical induction and thus extending it to
verification [20]. Unfortunately, inductive approaches are not
always powerful enough to prove the required verification
conditions, because not all program invariants are inductive.
This problem can be mitigated by using the more general
k-induction instead of standard induction, an approach
which has already been implemented in the DMA-race analysis
tool Scratch and in the software verifier Esbmc [29].
Nevertheless, additional supportive measures are often required
to guide k-induction and take advantage of its full potential.
Our goal is to provide a powerful and competitive approach for
reliable, general-purpose software verification based on BMC
and k-induction, implemented in a state-of-the-art software
verification framework.

Our contribution is a new combination of k-induction-based
model checking with automatically-generated continuously-refined
invariants that are used to strengthen the induction
hypothesis, which increases the effectiveness of the approach.
BMC and k-induction are combined in an algorithm that
iteratively increments the induction parameter k. The invariant
generation runs in parallel to the k-induction proof construction,
starting with relatively weak (but inexpensive to compute)
invariants, and increasing the strength of the invariants over
time as long as the analysis continues. The k-induction-based
proof construction adopts the currently known set of invariants
in every new proof attempt. This approach can verify easy
problems quickly (with a small initial k and weak invariants),
and is able to verify complex problems by increasing the
effort (by incrementing k and searching for stronger invariants).
Thus, it is both efficient and effective. In contrast to previous
work [29], the new approach is sound. We implemented
our approach as part of the open-source software-verification
framework CPAchecker [10], and we perform an extensive
experimental comparison of our implementation against the two
existing tools that use similar techniques and against another
successful software-verification approach.


Availability of Data and Tools. Our experiments are based
on benchmark verification tasks from the 2015 Competition
on Software Verification. All benchmarks, tools, and results of
our evaluation are available on a supplementary web page.

Contributions. We make the following novel contributions:
We develop an approach for providing continuously refined
invariants to k-induction by using configurable program analysis
with precision refinement. We also present an extensive
evaluation in which we compare different approaches and
implementations against the implementation of our proposed
approach and show that our technique outperforms other
approaches to software verification with k-induction.


Example. We illustrate the open problem of k-induction that
we address, and the strength of our approach, on two example
programs. Both programs encode an automaton, which is
typical, e.g., for software that implements a communication
protocol. The automaton has a finite set of states, which is
encoded by variable s, and two data variables x1 and x2.
There are some state-dependent calculations (lines 5 and 6 in
both programs) that alternatingly increment x1 and x2, and a
calculation of the next state (lines 8 and 9 in both programs).
The state variable cycles through the range from 1 to 4. These
calculations are done in a loop with a non-deterministic number
of iterations. Both programs also contain a safety property
(the label ERROR should not be reachable). The program
example-safe in Fig. 1 checks that in every fourth state,
the values of x1 and x2 are equal; it satisfies the property.
The program example-unsafe in Fig. 2 checks that when
the loop exits, the value of state variable s is not greater than or
equal to 4; it violates the property.

 1  int main() {
 2    unsigned int x1 = 0, x2 = 0; int s = 1;
 3
 4    while (nondet()) {
 5      if (s == 1) x1++;
 6      else if (s == 2) x2++;
 7
 8      s++;
 9      if (s == 5) s = 1;
10
11      if ((s == 1) && (x1 != x2)) {
12        // Valid safety property
13        ERROR: return 1;
14      }
15    }
16  }

Fig. 1: Safe example program example-safe, which cannot
be proven with existing k-induction-based approaches

 1  int main() {
 2    unsigned int x1 = 0, x2 = 0; int s = 1;
 3
 4    while (nondet()) {
 5      if (s == 1) x1++;
 6      else if (s == 2) x2++;
 7
 8      s++;
 9      if (s == 5) s = 1;
10    }
11
12    if (s >= 4) {
13      // Invalid safety property (s may be 4)
14      ERROR: return 1;
15    }
16  }

Fig. 2: Unsafe example program example-unsafe, where
Esbmc produces a wrong proof

First, note that the program example-safe is difficult or
impossible to prove with other software-verification approaches:
(1) BMC cannot prove safety for this program because the
loop may run arbitrarily long. (2) Explicit-state model checking
fails because of the huge state space (x1 and x2 can get
arbitrarily large). (3) Predicate analysis with counterexample-guided
abstraction refinement (CEGAR) and interpolation is
able to prove safety, but only if the predicate x1 = x2 gets
discovered. If the interpolants contain instead only predicates
such as x1 = x2 = 1, x1 = 2, etc., the analysis will not
terminate. Which predicates get discovered is hard to control
and usually depends on internal interpolation heuristics of
the satisfiability-modulo-theory (SMT) solver. (4) Traditional
1-induction is also not able to prove the program safe because
the assertion is checked only in every fourth loop iteration
(when s is 1). Thus, the induction hypothesis is too weak (the
program state s = 4, x1 = 0, x2 = 1 is a counterexample
for the step case in the induction proof).

Intuitively, this program should be provable by k-induction
with a k of at least 4. However, for every k, there is a
counterexample to the inductive-step case that refutes the proof.
For such a counterexample, set s = −k, x1 = 0, x2 = 1
at the beginning of the loop. Starting in this state, the program
would increment s k times (induction hypothesis) and then
reach s = 1 with property-violating values of x1 and x2
in iteration k + 1 (inductive step). It is clear that s can
never be negative, but this fact is not present in the induction
hypothesis, and thus the proof fails. This illustrates the general
problem of k-induction-based verification: safety properties
often do not hold in unreachable parts of the state space of a
program, and k-induction alone does not distinguish between
reachable and unreachable parts of the state space. If Esbmc with
k-induction analyzes program example-safe, the analysis,
as expected, iteratively increments k and loops infinitely,
failing to prove safety.
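
The counterexample to the inductive step can be replayed by executing the loop body of example-safe from an arbitrary, possibly unreachable, state. The following C sketch is only an illustration under that reading; the names State, loop_body, and first_violation are hypothetical helpers and not part of any tool discussed here.

```c
#include <stdbool.h>

/* Program state of example-safe at the loop head (hypothetical helper type). */
typedef struct {
    unsigned int x1, x2;
    int s;
} State;

/* Executes one iteration of the loop body of example-safe (Fig. 1).
   Returns true if this iteration reaches the ERROR label. */
bool loop_body(State *st) {
    if (st->s == 1) st->x1++;
    else if (st->s == 2) st->x2++;

    st->s++;
    if (st->s == 5) st->s = 1;

    return st->s == 1 && st->x1 != st->x2;
}

/* Starts in the unreachable state s = -k, x1 = 0, x2 = 1 and returns the
   1-based index of the first iteration that violates the property, or 0 if
   none of the first max_iter iterations does. */
int first_violation(int k, int max_iter) {
    State st = { .x1 = 0, .x2 = 1, .s = -k };
    for (int i = 1; i <= max_iter; i++) {
        if (loop_body(&st)) return i;
    }
    return 0;
}
```

For every k ≥ 1, the first k iterations from this start state are free of violations and iteration k + 1 reaches ERROR, which is exactly the situation that refutes the inductive step.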

This program could of course be verified more easily if
it were rewritten to contain a stronger safety property such
as s ≥ 1 ∧ s ≤ 4 ∧ (s = 2 ⇒ x1 = x2 + 1) ∧ (s ≠ 2 ⇒
x1 = x2) (which is a loop invariant and allows a proof by
1-induction without auxiliary invariants). However, our goal is
to automatically verify real programs, and programmers usually
write down neither trivial properties such as s ≥ 1 nor overly
complex properties such as s ≠ 2 ⇒ x1 = x2.

With our approach of combining k-induction with invariants,
the program is proved safe with k = 4 and the invariant
s ≥ 1. This invariant is easy to find automatically using an
inexpensive static analysis, such as an interval analysis. For
bigger programs, a more complex invariant might be necessary,
which might get generated at some point by our continuous
strengthening of the invariant. Furthermore, stronger invariants
can reduce the k that is necessary to prove a program. For
example, the invariant s ≥ 1 ∧ s ≤ 4 ∧ (s ≠ 2 ⇒ x1 = x2)
(which is still weaker than the full loop invariant above) allows
proving the program with k = 2. Thus, our strengthening of
invariants can also shorten the inductive proof procedure and
lead to better performance.
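
The effect of the auxiliary invariant s ≥ 1 on the inductive-step case can be checked by brute force on a small, bounded set of start states. The C sketch below is a rough illustration only: the bounds (s between −8 and 8, x1 and x2 between 0 and 3) and the names State, loop_body, invariant, and step_case_holds are assumptions made for this example, whereas a real verifier hands the unbounded step case to an SMT solver.

```c
#include <stdbool.h>

/* Program state of example-safe at the loop head (hypothetical helper type). */
typedef struct {
    unsigned int x1, x2;
    int s;
} State;

/* One iteration of the loop body of example-safe (Fig. 1);
   returns true if the iteration reaches the ERROR label. */
bool loop_body(State *st) {
    if (st->s == 1) st->x1++;
    else if (st->s == 2) st->x2++;

    st->s++;
    if (st->s == 5) st->s = 1;

    return st->s == 1 && st->x1 != st->x2;
}

/* Auxiliary invariant s >= 1, as an interval analysis could provide it. */
bool invariant(const State *st) {
    return st->s >= 1;
}

/* Checks the inductive-step case for bound k on the bounded domain:
   returns false if some start state (restricted to the invariant when
   use_invariant is set) allows k violation-free iterations followed by a
   violation in iteration k + 1, and true otherwise. */
bool step_case_holds(int k, bool use_invariant) {
    for (int s = -8; s <= 8; s++) {
        for (unsigned int x1 = 0; x1 <= 3; x1++) {
            for (unsigned int x2 = 0; x2 <= 3; x2++) {
                State st = { .x1 = x1, .x2 = x2, .s = s };
                if (use_invariant && !invariant(&st)) continue;

                /* Induction hypothesis: k iterations without violation. */
                bool hypothesis = true;
                for (int i = 0; i < k && hypothesis; i++) {
                    if (loop_body(&st)) hypothesis = false;
                }
                /* Inductive step: iteration k + 1 must not violate. */
                if (hypothesis && loop_body(&st)) return false;
            }
        }
    }
    return true;
}
```

On this bounded domain, the step case fails for k = 1 to 8 without the invariant, fails for k = 3 with the invariant, and holds for k = 4 with the invariant, matching the discussion above.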

Esbmc [29] tries to solve this problem of a too-weak induction
hypothesis by initializing only the variables of the loop-termination
condition to a non-deterministic value in the step
case, and initializing all other variables to their initial value
in the program. However, this approach is not strong enough
for the program example-safe and even produces a wrong
proof (unsound result) for the program example-unsafe.
This second example program contains a different safety
property about s, which is violated. Because the variable s
does not appear in the loop-termination condition, it is not
set to an arbitrary value in the step case as it should be, and
the inductive proof wrongly concludes that the program is
safe because the induction hypothesis is too strong. Esbmc
misses the bug in this program and claims it is correct. Our
approach does not suffer from this unsoundness, because we
only add invariants to the induction hypothesis that the invariant
generation has proven to hold.


Related Work. The use of auxiliary invariants is a common
technique in software verification [15, 22, 24], and techniques
combining abstract interpretation and SMT solvers also
exist [25]. In most cases, the purpose is to speed up the
analysis. For k-induction, however, the use of invariants is
crucial in making the analysis terminate at all (cf. Fig. 1).
There are several approaches to software verification using
BMC in combination with k-induction.


Split-Case Induction. We use the split-case k-induction
technique [20, 21], where the base case and the step case are
checked in separate steps. Because this technique is
only able to handle one loop at a time, another similarity to the
approach of the earlier versions of Scratch is the transformation
of programs with multiple loops into programs with
only one single monolithic loop using a standard approach.
The alternative of recursively applying the technique to nested
loops was discarded by the authors of Scratch [21], because
their experiments suggested it was less efficient than checking
the single loop that is obtained by the transformation. Scratch
also supports combined-case k-induction [19], for which all
loops are cut by replacing them with k copies each for the
base and the step case, and setting all loop-modified variables
to non-deterministic values before the step case. That way,
both cases can be checked at once in the transformed program
and no special handling for multiple loops is required. When
using combined-case k-induction, Scratch requires loops to
be manually annotated with the required k values, whereas
its implementation of split-case k-induction supports iterative
deepening of k as in our implementation. Contrary to Scratch,
we do not focus on one specific problem domain [20, 21],
but want to provide a solution for solving a wide range of
heterogeneous verification tasks.

Auxiliary Invariants. While both the split-case and the
combined-case k-induction supposedly succeed with weaker
auxiliary invariants than, for example, the inductive-invariant
approach, the approaches still do require auxiliary invariants
in practice, and the tool Scratch requires these invariants
to be annotated manually [19, 21]. There are techniques for
automatically generating invariants that may be used to help
inductive approaches to succeed [16]. These techniques,
however, are not guaranteed to justify their additional effort
by providing the required invariants on time, especially if
strong auxiliary invariants are required. Based on previous
ideas of supporting k-induction with invariants generated by
lightweight static analysis, we therefore strive to leverage
the power of the k-induction approach to succeed with auxiliary
invariants generated by a static analysis based on intervals.
However, to handle cases where it is necessary to invest more
effort into invariant generation, we increase the precision of
these invariants over time. A verification tool using a strategy
similar to ours is PKind [22, 26], a model checker for Lustre
programs based on k-induction. In PKind, there is a parallel
computation of auxiliary invariants, where potential invariants
derived by templates are iteratively checked via k-induction
and, if successful, added to the set of known invariants. While
this allows for strengthening the induction hypothesis over
time, the template-based approach lacks the flexibility that is
available to an invariant generator using dynamic precision
refinement, and the required additional induction proofs
are potentially expensive.


Unsound Strengthening of Induction Hypothesis. Esbmc does
not require additional invariants for k-induction, because it
assigns non-deterministic values only to the loop-termination
condition variables before the inductive-step case [29] and
thus retains more information than our implementation as well
as the Scratch implementation [19, 21], but k-induction in Esbmc
is therefore potentially unsound. Our goal is to perform a real proof of
safety by removing all pre-loop information in the step case,
thus treating the unrolled iterations in the step case truly as "any
k consecutive iterations", as is required for mathematical
induction. Our approach counters this lack of information by
employing incrementally improving invariant generation.


Parallel Induction. In PKind, base case and step case are
checked in parallel, and the latest version of Esbmc, version
1.23, supports parallel execution of the base case, the forward
condition, and the inductive-step case. In contrast, our base
case and inductive-step case are checked sequentially, while
our invariant generation runs in parallel to the aforementioned
base- and step-case checks.


II. Background 

We briefly explain existing concepts that our approach uses. 

Programs. We use the same notion of programs to describe
the theoretical aspects of our ideas as in previous work. The
presentation of our work is restricted to a simple imperative
programming language that contains only assume operations
and assignments. All variables are assumed to be integers
(our implementation is based on CPAchecker, which supports
C programs). Programs are represented by control-flow automata.
A control-flow automaton (CFA) consists of a set L of program locations,
modeling the program counter l, the initial program location l_0,
modeling the program entry, and a set G ⊆ L × Ops × L of
control-flow edges, each of which models the operation that is
executed during the flow of control from one program location
to another. The variables that occur in operations from Ops are
contained in the set X of program variables. In our presentation,
we assume that each program contains at most one loop. In
our implementation, we handle programs with multiple loops
by transforming all loops into a single monolithic loop.

Configurable Program Analysis. We use the concepts of configurable
program analysis (CPA) with dynamic precision
adjustment. A CPA defines an abstract domain and a transfer
relation, together with a merge operator to specify what happens
at meet points in the control flow and a stop operator to specify
the fixed-point conditions. The software-verification framework
CPAchecker allows plugging in CPAs as components, and
CPAs can be reused and combined, such that common tasks
like tracking the program counter or the call stack do not need
to be considered in every single analysis. The CPA algorithm
optionally merges (as defined by the merge operator) newly-discovered
abstract states with previously existing abstract
states to produce an abstract state covering both states,
over-approximating them. This over-approximation may result in a
loss of information, but reduces the number of states in favor
of efficiency. Each abstract state is paired with a precision,
which specifies how fine-grained the analysis should work (to
find a compromise between being efficient and precise).

Bounded Model Checking. The technique of bounded model
checking (BMC) was originally introduced as an alternative to
binary decision diagrams (BDD) in symbolic model checking,
to produce counterexamples more quickly, and to speed up
verification in general. Classic BMC reduces model checking
to propositional satisfiability (SAT): Only counterexamples
up to a given length k are considered and a propositional
formula f is constructed such that f is satisfiable iff such a
counterexample exists. A SAT solver can be used to check the
satisfiability of f and, if f is satisfiable, the counterexample can
be reconstructed from the model for f, which is provided by the
SAT solver. If f is unsatisfiable, however, this only shows that no
counterexample with a length smaller than or equal to k exists.
Thus, unless it is known that all reachable states are covered by
BMC with length k, the absence of longer counterexamples cannot be
guaranteed. Therefore, BMC is often classified as a technique
for falsification, not for verification. Nowadays, BMC is based
on solvers for satisfiability modulo theories (SMT).

k-Induction. BMC-based approaches can be extended from
falsification to verification by induction. Consider a program
that contains a loop, and a safety property P. BMC with k = 1
may show that no counterexample (a violation of P) of length
k = 1 exists (a), but a longer counterexample might still exist.
If, however, we are able to prove that for any given iteration
through the loop where P holds before, P also holds after the
iteration (b), the program is verified by induction, where (a) is
the base case and (b) is the inductive-step case. Consider as
a more formal example the standard induction principle over
natural numbers:

$$\left( P(0) \land \forall n : \left( P(n) \Rightarrow P(n+1) \right) \right) \Rightarrow \forall n : P(n)$$

This can be extended to greater values of k by asserting the
safety property P for not only 1 but k consecutive predecessors
in the step case, which is known as k-induction. k-induction
over natural numbers can be written as:

$$\left( \bigwedge_{i=0}^{k-1} P(i) \land \forall n : \left( \left( \bigwedge_{i=0}^{k-1} P(n+i) \right) \Rightarrow P(n+k) \right) \right) \Rightarrow \forall n : P(n)$$

Intuitively, the induction proof is more likely to succeed for
higher values of k, because the inductive-step case asserts
the safety property for more consecutive predecessors, thus
a less general case is checked. It holds that for k > 1,
(k − 1)-induction implies k-induction and that therefore
(k − 1)-induction must always be at least as hard as k-induction.

Invariants. An assertion φ is called an invariant of a program
if φ is true for all states of that program [28]. If φ is an assertion
that specifies the safety property of a program and φ is invariant,
then the program is safe. Proving the invariance of an assertion
is therefore a method of software verification. An assertion φ
is called inductive if it is provable by induction [16]. However,
not every invariant assertion is inductive. One solution to this
problem is trying to find an inductive assertion ψ that is stronger
than φ, i.e., ψ ⇒ φ. Trivially, if ψ is invariant then φ is also
invariant. This strengthening of assertions can be achieved by
creating the conjunction of φ and an auxiliary invariant φ' such
that ψ := φ ∧ φ'. By choosing the auxiliary invariant in a
way that excludes those unreachable "good" states that have
transitions to "bad" successor states, the stronger invariant may
be inductive where the weaker one was not.


III. k-Induction with Invariants

Our verification approach consists of two algorithms that run
concurrently. One algorithm is responsible for the generation
of program invariants, starting with imprecise invariants that
are continuously refined (strengthened). The other algorithm
is responsible for finding counterexamples with BMC and
constructing safety proofs with k-induction, for which it
periodically picks up the invariants that the former algorithm has
constructed so far. The k-induction algorithm uses information
from the invariant analysis, but not vice versa.

Iterative-Deepening k-Induction. Algorithm 1 shows our
extension of the k-induction algorithm to a combination with
continuously-refined invariants. Starting with an initial value for
the bound k, e.g., 1, we iteratively increase the value of k after
each unsuccessful attempt at finding a specification violation
or proving correctness of the program using k-induction. The
following description of our approach to k-induction is based
on split-case k-induction [19], where for the propositional state
variables s and s' within a state transition system representing
the program, the predicate I(s) denotes that s is an initial state,
T(s, s') states that a transition from s to s' exists, and P(s)
asserts the safety property for the state s.

Base Case. Lines 3 to 5 show the base case, which consists of
running BMC with the current bound k. This means that starting
from an initial program state, all paths of the program up to
a maximum loop bound k are explored. (As an optimization,
one can omit checking for property violations which have been
checked in previous iterations with lower values of k already.)
Formally, there exists a counterexample of length at most k if
the following holds:

$$I(s_0) \land \bigvee_{n=0}^{k-1} \left( \bigwedge_{i=0}^{n-1} T(s_i, s_{i+1}) \land \lnot P(s_n) \right)$$

If a counterexample is found, the algorithm terminates.

Forward Condition. Otherwise we check whether there exists
a path with length k' > k in the program, or whether we have
already fully explored the state space of the program (lines 6
to 8). In the latter case the program is safe and the algorithm
terminates. This check is called the forward condition [23].
Formally, the program was fully explored and is safe if the
following is unsatisfiable:

$$I(s_0) \land \bigwedge_{i=0}^{k-1} T(s_i, s_{i+1})$$

Algorithm 1 Iterative-Deepening k-Induction

Input:
  the initial value k_init ≥ 1 for the bound k,
  an upper limit k_max for the bound k,
  a function inc : ℕ → ℕ with ∀n ∈ ℕ : inc(n) > n
    for increasing the bound k,
  the initial states defined by the predicate I,
  the transfer relation defined by the predicate T, and
  a safety property P

Output: true if P holds, false otherwise

 1: k := k_init
 2: while k ≤ k_max do
 3:   base_case := $I(s_0) \land \bigvee_{n=0}^{k-1} \left( \bigwedge_{i=0}^{n-1} T(s_i, s_{i+1}) \land \lnot P(s_n) \right)$
 4:   if sat(base_case) then
 5:     return false
 6:   forward_condition := $I(s_0) \land \bigwedge_{i=0}^{k-1} T(s_i, s_{i+1})$
 7:   if ¬sat(forward_condition) then
 8:     return true
 9:   step_case_n := $\bigwedge_{i=n}^{n+k-1} \left( P(s_i) \land T(s_i, s_{i+1}) \right) \land \lnot P(s_{n+k})$
10:   repeat
11:     Inv := get_currently_known_invariant()
12:     if ¬sat(∃n ∈ ℕ : Inv(s_n) ∧ step_case_n) then
13:       return true
14:   until Inv = get_currently_known_invariant()
15:   k := inc(k)
16: return unknown

Algorithm 2 Continuous Invariant Generation

Input:
  a configurable program analysis with dynamic precision
    adjustment D,
  the initial states defined by the predicate I,
  a coarse initial precision π_0,
  a safety property P

Output: true if P holds

 1: π := π_0
 2: Inv := true
 3: loop
 4:   reached := CPAAlgorithm(D, I(s), π)
 5:   if ∀s ∈ reached : P(s) then
 6:     return true
 7:   Inv := Inv ∧ $\bigvee_{s \in reached} s$
 8:   π := RefinePrec(π)


Inductive Step. Checking the forward condition can, however,
only prove safety for programs with finite (and short) loops.
Therefore the algorithm also attempts an inductive proof (lines 9
to 14). The base case for induction was already checked before.
The inductive-step case checks that, after any sequence of
k loop iterations without a property violation, there is also no
property violation in loop iteration k + 1. For model checking
of software, however, this would often fail. The reason for this
is that by induction we try to prove the property for every part
of the state space of the program. Typically, a program has
large parts of the state space that are unreachable, for which
the property might not hold but which are irrelevant for the
safety of the program. As an example, a typical loop in a
program uses a loop counter which has only positive values,
and with induction we would try to prove the property for all
possible values of the loop counter, including negative values.
The key to success for using induction for safety proofs of
programs is thus to exclude as many unreachable parts of the
state space as possible from the proof. This can be done by
adding assumptions about program variables to the induction
hypothesis. In our approach, we make use of the fact that
the invariants that were generated so far by the concurrently-running
invariant-generation algorithm hold, and conjoin these
facts to the induction hypothesis. Thus, the inductive-step case
can prove a program as safe if the following is unsatisfiable:

$$\exists n \in \mathbb{N} : \mathit{Inv}(s_n) \land \bigwedge_{i=n}^{n+k-1} \left( P(s_i) \land T(s_i, s_{i+1}) \right) \land \lnot P(s_{n+k})$$

where Inv are the currently available program invariants. If
this formula is satisfiable, the induction check is inconclusive,
and the program cannot be proved as safe or unsafe with the
current value of k and the current invariants. If during the time
of the satisfiability check of the step case a new (stronger)
invariant has become available (condition in line 14 is false),
we immediately recheck the step case with the new invariant.
This can be done efficiently using an incremental SMT solver
for the repeated satisfiability checks in line 12. Otherwise we
start over with an increased value of k.

Note that the inductive-step case is similar to BMC
that checks for the presence of counterexamples of exactly
length k + 1. However, as the step case needs to consider
any consecutive k + 1 loop iterations, and not only the first
such iterations, it does not assume that the execution of the
loop iterations begins in the initial state. Instead, it assumes
that there is a sequence of k iterations without any property
violation (this is the induction hypothesis).
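
The interplay of base case, forward condition, and inductive-step case can be sketched on a tiny explicit-state transition system. This is a minimal illustration under strong assumptions, not the implementation in CPAchecker: sets of states are bit masks instead of SMT formulas, the invariant is passed in once instead of being picked up repeatedly from a concurrent analysis, inc(k) is fixed to k + 1, and the names System, Result, post, and k_induction are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define NUM_STATES 8   /* states are 0..7; a set of states is a bit mask */

typedef enum { RESULT_UNSAFE, RESULT_SAFE, RESULT_UNKNOWN } Result;

/* Explicit transition system, a stand-in for the predicates I, T, and P. */
typedef struct {
    uint32_t init;               /* I: set of initial states          */
    uint32_t succ[NUM_STATES];   /* T: successor set of each state    */
    uint32_t safe;               /* P: states satisfying the property */
} System;

/* Image of a set of states under the transition relation. */
uint32_t post(const System *sys, uint32_t set) {
    uint32_t out = 0;
    for (int s = 0; s < NUM_STATES; s++) {
        if (set & (1u << s)) out |= sys->succ[s];
    }
    return out;
}

/* Iterative-deepening k-induction with a fixed auxiliary invariant inv,
   which the caller must guarantee to contain all reachable states. */
Result k_induction(const System *sys, uint32_t inv, int k_init, int k_max) {
    for (int k = k_init; k <= k_max; k++) {
        /* Base case: violation reachable from I within at most k-1 steps. */
        uint32_t frontier = sys->init;
        for (int n = 0; n < k; n++) {
            if (frontier & ~sys->safe) return RESULT_UNSAFE;
            frontier = post(sys, frontier);
        }

        /* Forward condition: frontier now holds the states at the end of
           paths with exactly k transitions; if there are none, the state
           space was fully explored by the base case. */
        if (frontier == 0) return RESULT_SAFE;

        /* Step case: look for s_n in inv, followed by k safe states in a
           row (s_n .. s_{n+k-1}) and an unsafe successor s_{n+k}. */
        uint32_t window = inv & sys->safe;
        for (int i = 1; i < k; i++) {
            window = post(sys, window) & sys->safe;
        }
        if ((post(sys, window) & ~sys->safe) == 0) return RESULT_SAFE;
    }
    return RESULT_UNKNOWN;
}
```

As a usage example, consider a system with the reachable cycle 0 → 1 → 2 → 3 → 0, an unreachable state 4 with transitions to itself and to the unsafe state 7, and a self-loop on 7. Without an invariant (inv containing all states), the step case fails for every k because of the path 4, 4, ..., 4, 7, and the result is unknown once k_max is exceeded. With the invariant {0, 1, 2, 3}, the step case already succeeds for k = 1.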

Continuous Invariant Generation. Our continuous invariant
generation incrementally produces stronger and stronger program
invariants. It is based on an invariant-generation procedure
that is run in a loop, each time with an increased precision.
Each time the invariant has been strengthened, it can be used
as auxiliary invariant by the k-induction procedure. It may
happen that this analysis proves safety of the program all by
itself, but this is not its main purpose in our application.

Algorithm. Algorithm 2 shows our continuous invariant
generation. The initial program invariant is represented by the
formula true. We start with running the invariant-generation
analysis once with a coarse initial precision. After each run of
the program-invariant generation, we strengthen the previously-known
program invariants with the newly-generated invariants
(line 7) and announce the result globally (such that the k-induction
algorithm can use it). If the analysis was able to prove safety of
the program, the algorithm terminates (lines 5 to 6). Otherwise,
the analysis is restarted with a higher precision.

Our approach works with any kind of invariant-generation
procedure, as long as its precision, i.e., its level of abstraction, is
configurable. We use the reachability algorithm CPAAlgorithm
for configurable program analysis with dynamic precision
adjustment. It takes as input a configurable program
analysis (CPA), an initial abstract state, and a precision. It
returns a set of reachable abstract states that form an
over-approximation of the reachable program states. This algorithm
works with any abstract domain that can be formalized as
a CPA. Depending on the used CPA and the precision, the
analysis done by CPAAlgorithm can be efficient and abstract
like data-flow analysis as well as expensive and precise like
model checking.

Abstract Domain. For the invariant generation we use an
abstract domain based on expressions over intervals. Note
that this is not a requirement of our approach, which works
with any kind of domain. Our choice is based on the high
flexibility of this domain, which can be fast and efficient as
well as precise.

The analysis is formalized and implemented as a CPA
with dynamic precision adjustment. An abstract state of our
invariant-generation domain consists of a mapping M : X →
Expr from program variables to arithmetic expressions, where
Expr is the set of expressions and X is the set of variables. The
set Expr of expressions consists of binary expressions, unary
expressions, program variables, and disjunctions of intervals,
and is defined recursively as Expr ⊆ ((Expr × B × Expr) ∪
(U × Expr) ∪ X ∪ I), where B is the set of supported binary
operators (including +, |, ∨, &, ∧, ≫, ≪, and ∪), U is
the set of supported unary operators (including ¬), and I
is the set of disjunctions of intervals of the form [l, u] with
l, u ∈ ℤ ∪ {±∞}. The disjunctions of intervals allow for an efficient
representation of ranges, and, unlike in single-interval-based
approaches, gaps between ranges can also be represented.

Precision. In our CPA, the precision is a triple (Y, n, w), where
Y ⊆ X is a specific selection of important program variables,
n is the maximal nesting depth of expressions in the abstract
state, and w is a boolean specifying whether widening should
be used. Those variables that are considered important will not
be over-approximated by merging abstract states. With a higher
nesting depth, more precise relations between variables can be
represented. The use of widening ensures timely termination
(at the expense of a lower precision) even for programs with
loops with many iterations, like those in Figs. 1 and 2.

Merge. Our CPA merges two abstract states if both states do
not differ in the expressions that are stored for the important
program variables from the set $Y$ of the precision. This way,
the loss of information resulting from merging two abstract
states does not affect the selected variables in $Y$. Naturally,
the more variables are in the precision, the fewer merges
occur, resulting in a more precise but slower analysis. To
guarantee timely termination of the analysis even over loops
with many iterations, like those shown in the earlier examples,
a widening strategy for over-approximating variable values
may be used when merging abstract states. Formally, for two
abstract states $e_1, e_2$ and a precision $\pi = (Y, n, w)$, the merge
operator is defined as

$$
\mathit{merge}(e_1, e_2, \pi) =
\begin{cases}
\mathit{widen}(e_1, e_2) & \text{if } w \land \neg\mathit{differ}_\pi(e_1, e_2) \\
\mathit{union}(e_1, e_2) & \text{if } \neg w \land \neg\mathit{differ}_\pi(e_1, e_2) \\
e_2 & \text{if } \mathit{differ}_\pi(e_1, e_2)
\end{cases}
$$

with $\mathit{differ}_\pi(e_1, e_2) = \exists v \in Y : e_1(v) \neq e_2(v)$. The operation
$\mathit{union}(e_1, e_2)$ returns an abstract state where for each variable
the union of the values for this variable in $e_1$ and $e_2$ is used.
The operation $\mathit{widen}(e_1, e_2)$ over-approximates by assigning
to each variable only a single (potentially infinite) interval.
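
The three cases of the merge operator can be sketched in Python as follows. This is a simplified illustration under stated assumptions: abstract states are plain dictionaries that map each variable to a sorted list of integer intervals (no nested expressions), both states track the same variables, and the concrete widening rule (jump to infinity when a bound grows) is one common choice that the text above does not prescribe.

```python
import math


def normalize(intervals):
    """Sort intervals and fuse neighbors that overlap or are adjacent integers."""
    fused = []
    for lo, hi in sorted(intervals):
        if fused and lo <= fused[-1][1] + 1:
            fused[-1] = (fused[-1][0], max(fused[-1][1], hi))
        else:
            fused.append((lo, hi))
    return fused


def differ(e1, e2, important):
    """True if some important variable has different values in e1 and e2."""
    return any(e1[v] != e2[v] for v in important)


def union(e1, e2):
    """Per variable, keep the union of the values from both states."""
    return {v: normalize(e1[v] + e2[v]) for v in e1}


def widen(e1, e2):
    """Per variable, keep one interval; a growing bound jumps to infinity (assumed rule)."""
    result = {}
    for v in e1:
        lo1, hi1 = e1[v][0][0], e1[v][-1][1]
        lo2, hi2 = e2[v][0][0], e2[v][-1][1]
        lo = lo1 if lo2 >= lo1 else -math.inf
        hi = hi1 if hi2 <= hi1 else math.inf
        result[v] = [(lo, hi)]
    return result


def merge(e1, e2, precision):
    """Merge operator for a precision (Y, n, w); the nesting depth n is unused here."""
    important, _depth, widening = precision
    if differ(e1, e2, important):
        return e2  # states are kept separate
    return widen(e1, e2) if widening else union(e1, e2)


# Hypothetical states at a loop head after zero and one iteration.
e1 = {"x": [(0, 0)], "y": [(1, 1)]}
e2 = {"x": [(1, 1)], "y": [(1, 1)]}
print(merge(e1, e2, (set(), 1, True)))   # widening: x becomes [0, inf)
print(merge(e1, e2, (set(), 1, False)))  # union: x becomes [0, 1]
print(merge(e1, e2, ({"x"}, 1, True)))   # x is important: no merge, e2 is returned
```

With `x` in the set of important variables, the states stay separate and the analysis keeps the exact value of `x`, at the cost of exploring more abstract states.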

Precision Refinement. The initial precision $(\emptyset, 1, \mathit{true})$ for
this analysis specifies an empty set of variables as important
variables, i.e., abstract states belonging to the same program
location are always merged (by applying widening). The
maximum expression-nesting depth of $n = 1$ means that
abstract states map program variables to a single variable or
to a disjunction of intervals (no arithmetic operators allowed).

Our main refinement strategy is to add variables to the
set $Y$ of important program variables, first adding one variable,
and then doubling the size of the set in each refinement step.
When choosing variables for this step, we visit the control-flow
automaton backwards from the error location and pick
variables that appear in assume edges, such that variables
appearing in conditions close to the error location get added
first. This refinement strategy is property-guided, rather than
counterexample-guided like CEGAR.
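
The property-guided variable selection described above can be sketched as a backwards breadth-first traversal of the control-flow automaton. The edge encoding `(source, target, kind, variables)`, the function name `pick_variables`, and the use of breadth-first order to approximate "close to the error location" are assumptions for this illustration, not details of the actual implementation.

```python
from collections import deque


def pick_variables(edges, error_location, limit):
    """Collect up to `limit` variables from assume edges, nearest to the error first."""
    incoming = {}
    for src, dst, kind, variables in edges:
        incoming.setdefault(dst, []).append((src, kind, variables))

    picked = []
    seen = {error_location}
    queue = deque([error_location])
    while queue and len(picked) < limit:
        location = queue.popleft()
        for src, kind, variables in incoming.get(location, []):
            if kind == "assume":
                for v in variables:
                    if v not in picked and len(picked) < limit:
                        picked.append(v)
            if src not in seen:
                seen.add(src)
                queue.append(src)
    return picked


# Hypothetical CFA: location 4 is the error location.
cfa = [
    (0, 1, "statement", ["i"]),
    (1, 2, "assume", ["i", "n"]),
    (2, 3, "statement", ["s"]),
    (3, 4, "assume", ["s"]),  # condition guarding the error location
]
print(pick_variables(cfa, 4, 1))
print(pick_variables(cfa, 4, 2))
print(pick_variables(cfa, 4, 4))
```

Calling the function with limits 1, 2, 4, and so on mirrors the doubling of the size of the set of important variables in successive refinement steps.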

Additionally, we have a refinement step that increments
the expression-nesting depth to 2, allowing more complex
expressions, such as an addition of a variable with a disjunction
of intervals; this refinement is helpful if an invariant $x = y + 1$ is
required, but the values of $x$ and $y$ cannot be over-approximated
precisely enough. The third refinement strategy is to disable
the use of widening. Thus, the precision and the efficiency
of the analysis are dynamically adjusted during the analysis.
The maximal precision we use for our CPA is $(X, 2, \mathit{false})$,
which tracks all program variables almost fully precisely. Of
course, any other precision-refinement strategy applicable for
the chosen CPA can be used for our continuous invariant
generation, too.

IV. Experimental Evaluation

We compare our approach with other k-induction-based
approaches implemented in the same tool as well as with other
k-induction-based tools.

Benchmark Verification Tasks. As benchmark set we use
verification tasks from the 2015 Competition on Software
Verification (SV-COMP'15)¹. We took all 2814 verification tasks from
the categories ControlFlow, DeviceDrivers64, HeapManipulation,
Sequentialized, and Simple. The remaining categories
were excluded because they use features (such as bitvectors,
concurrency, and recursion) that not all configurations of our
evaluation support. Of the verification tasks in the benchmark
set, 742 contain a known specification violation. Although we cannot
expect an improvement for these verification tasks when using
auxiliary invariants, we did not exclude them because this
would unfairly benefit our approach (which spends some effort
generating invariants that are not helpful when proving the
existence of a counterexample).

Experimental Setup. All experiments were conducted on
computers with two 2.6 GHz 8-Core CPUs (Intel Xeon
E5-2560 v2) with 135 GB of RAM. The operating system was
Ubuntu 14.04 (64 bit), using Linux 3.13 and OpenJDK 1.7.
Each verification task was limited to two CPU cores, a CPU
run time of 15 min, and a memory consumption of 15 GB.

Presentation. All benchmarks, tools, and the full results of
our evaluation are available on a supplementary web page².

All reported times are rounded to two significant digits. We
use the scoring scheme of SV-COMP'15 to calculate a score for
each configuration. For every real bug found, 1 point is assigned;
for every correct safety proof, 2 points are assigned. A score
of 6 points is subtracted for every wrong alarm (false positive)
reported by the tool, and 12 points are subtracted for every
wrong proof of safety (false negative). This scoring scheme
values proving safety higher than finding counterexamples, and
significantly punishes wrong answers, which is in line with
the community consensus on the difficulty of verification vs.
falsification and the importance of correct results. We consider
this a good fit for evaluating an approach such as k-induction,
which aims at producing safety proofs.
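
The scoring scheme described above amounts to a weighted sum, which the following short Python sketch spells out. The function name `svcomp15_score` and the sample counts are hypothetical and serve only as an illustration.

```python
def svcomp15_score(bugs_found, safety_proofs, wrong_alarms, wrong_proofs):
    """Score as described in the text: +1, +2, -6, and -12 points per result."""
    return (1 * bugs_found
            + 2 * safety_proofs
            - 6 * wrong_alarms
            - 12 * wrong_proofs)


# Hypothetical counts: one wrong proof cancels six correct proofs.
print(svcomp15_score(bugs_found=0, safety_proofs=6, wrong_alarms=0, wrong_proofs=1))
print(svcomp15_score(bugs_found=10, safety_proofs=20, wrong_alarms=1, wrong_proofs=1))
```

The heavy penalties explain why a configuration with many wrong proofs can end up with a low score despite a large number of correct results.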

¹ http://sv-comp.sosy-lab.org/2015/
² http://www.sosy-lab.org/~dbeyer/cpa-k-induction/

In Figs. 3 and 5 we present experimental results using a plot
of quantile functions for accumulated scores as introduced by
the Competition on Software Verification, which shows
the score and CPU time for successful results and the score
for wrong answers. A data point (x, y) of a graph means that
for the respective configuration the sum of the scores of all
wrong answers and the scores for all correct answers with
a run time of less than or equal to y seconds is x. For the
left-most point (x, y) of each graph, the x-value shows the
sum of all negative scores for the respective configuration and
the y-value shows the time for the fastest successful result. For
the right-most point (x, y) of each graph, the x-value shows
the total score for this configuration, and the y-value shows the
maximal run time. A configuration can be considered better
the further to the right (the closer to 0) its graph begins (fewer
wrong answers), the further to the right it ends (more correct
answers), and the lower its graph is (less run time).
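
The construction of such a score-based quantile plot can be sketched in a few lines of Python. The input format (a list of `(score, cpu_seconds)` pairs with negative scores for wrong answers) and the function name `quantile_points` are assumptions for this illustration. The sketch follows the data-point definition given above, so the first point already includes the score of the fastest correct result.

```python
def quantile_points(results):
    """Return the (accumulated score, run time) points of one configuration."""
    # All wrong answers shift the graph to the left, regardless of their run time.
    accumulated = sum(score for score, _ in results if score < 0)
    points = []
    correct = sorted((r for r in results if r[0] > 0), key=lambda r: r[1])
    for score, seconds in correct:
        accumulated += score
        points.append((accumulated, seconds))
    return points


# Hypothetical results: two proofs, one bug found, one wrong alarm.
results = [(2, 5.0), (1, 0.5), (-6, 3.0), (2, 100.0)]
print(quantile_points(results))
```

Plotting these points with the accumulated score on the x-axis and the run time on the y-axis yields one graph of the kind described above.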

Comparison of k-induction-based approaches. To allow a
meaningful evaluation of our approach, we implemented it
together with other existing approaches in the same tool.
We used the Java-based open-source software-verification
framework CPAchecker, which is available online³ under
the Apache 2.0 license. For benchmarking, we used
revision 15499 from the trunk of the CPAchecker repository,
with MathSAT5⁴ as SMT solver. The k-induction algorithm of
CPAchecker was configured to increment k by 1 after each try
(in Alg. 1, inc(k) = k + 1). The precision refinement of the
continuous invariant generation was configured to increment the
number of important program variables in the first, third, fifth,
and any further precision refinements. The second precision
refinement increments the expression-nesting depth, and the
fourth precision refinement disables the widening operator.
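
The configured refinement order can be sketched as a Python generator that enumerates the precisions as triples (s, n, w), where s is the number of important variables. The generator name `refinement_schedule` and the choice to skip refinement steps that would not change the precision are assumptions of this sketch.

```python
def refinement_schedule(num_vars):
    """Yield precisions (s, n, w) from (0, 1, True) up to (num_vars, 2, False)."""
    current = (0, 1, True)
    yield current
    step = 0
    while current != (num_vars, 2, False):
        step += 1
        s, n, w = current
        if step == 2:
            n = 2                             # second refinement: nesting depth
        elif step == 4:
            w = False                         # fourth refinement: disable widening
        else:
            s = min(num_vars, max(1, 2 * s))  # otherwise: one variable, then double
        if (s, n, w) != current:
            current = (s, n, w)
            yield current


# Hypothetical program with five variables.
for precision in refinement_schedule(5):
    print(precision)
```

For five variables, the sequence ends at (5, 2, False), which corresponds to the maximal precision in which all program variables are tracked.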

We evaluated the following k-induction-based configurations:
(1) without any auxiliary invariants, (2) with statically-generated
invariants of different precisions, (3) with unsound
invariants using a reimplementation of the heuristic of
Esbmc [29], (4) with our new continuously-refined invariants.

The k-induction-based configuration using no
auxiliary invariants is an instance of Alg. 1 where
get_currently_known_invariant() always returns an empty set
of invariants and Alg. 2 does not run at all.

The configurations using statically-generated invariants are
also instances of Alg. 1. Here, Alg. 2 runs in parallel;
however, it terminates after one loop iteration. We denote
these configurations with triples (s, n, w), which represent the
precision (Y, n, w) of the invariant generation with s being the
size of the set of important program variables (s = |Y|). The
first of these configurations is (0, 1, true), which means that no
variables are in the set Y of important program variables (i.e.,
all variables get over-approximated by the merge operator),
the maximum nesting depth of expressions in the abstract
state is 1, and the widening operator is used. The second
configuration is (16, 2, true), which means that 16 variables
are in the set Y, the nesting depth of expressions in the abstract
state is limited to 2, and the widening operator is used. The
third configuration is (16, 2, false), where 16 variables are in
the set Y, the maximum nesting depth of expressions in the
abstract state is 2, and the widening operator is not used. These
configurations were selected because they represent some of
the extremes of the precisions used during dynamic invariant
generation. It is, however, impossible to cover every possible
valid configuration within the scope of this paper.

³ http://cpachecker.sosy-lab.org
⁴ http://mathsat.fbk.eu

TABLE I: Results of k-induction-based configurations in
CPAchecker for all 2814 verification tasks with different
approaches for generation of auxiliary invariants

Invariant generation      none    Static    Static    Static    Esbmc      cont.-
                                  (0,1,t)   (16,2,t)  (16,2,f)  heuristic  refined
Score                     1557    3184      3263      3177      204        3464
Correct results           1036    1852      1893      1849      1827       1981
Wrong proofs              2       1         1         2         263        1
Wrong alarms              12      10        11        9         8          7
CPU time (h)              400     200       190       200       140        170
Wall time (h)             380     150       130       120       130        100
Times for correct results only:
  CPU time (h)            7.1     14        15        13        8.8        17
  Wall time (h)           5.7     8.4       8.9       7.6       6.9        9.4
k-values for correct safe results only:
  Max. final k            101     101       101       119       120        88
  Avg. final k            2.4     2.0       2.3       2.3       2.0        2.4

The heuristic of Esbmc is to preserve information about
variable values before the loop to help the step-case check to
succeed. A sound technique for using pre-loop information in
the step case is to havoc the loop-modified variables, i.e., to
remove all information about these loop-modified variables,
but keep all other information, effectively propagating
constants to the step case. Esbmc, however, heuristically selects only
those variables for havocking that appear in loop-termination
conditions [29]. This technique is easier and computationally
cheaper than generating sound auxiliary invariants, but may
lead to wrong verification results, as shown in an earlier
section for our example.
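
The difference between sound havocking and the heuristic selection can be illustrated with a small Python sketch. The dictionary-based state, the function name `havoc`, and the variable names are hypothetical; the sketch only mirrors the description above and is not the implementation used by Esbmc or CPAchecker.

```python
def havoc(pre_loop_state, variables):
    """Forget everything about the given variables; keep all other facts."""
    return {v: value for v, value in pre_loop_state.items() if v not in variables}


# Hypothetical facts known before the loop, e.g. for
#   i = 0; n = 10; flag = 0; while (i < n) { i++; flag = 1; }
pre_loop = {"i": 0, "n": 10, "flag": 0}
loop_modified = {"i", "flag"}      # variables assigned in the loop body
in_exit_condition = {"i", "n"}     # variables in the loop-termination condition

sound = havoc(pre_loop, loop_modified)
heuristic = havoc(pre_loop, in_exit_condition)

print(sound)      # only the constant n survives
print(heuristic)  # flag == 0 is kept although the loop changes flag
```

In this example the heuristic carries the stale fact `flag == 0` into the step case, which can make an inductive proof succeed for a property that does not actually hold.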

Score. Using the unsound heuristic of Esbmc for invariant 
generation produces 263 wrong proofs, which shows that 
this is not a suitable approach for proving program safety. 
In contrast, the few wrong proofs produced by the other 
configurations are not due to conceptual problems, but only 
due to incompletenesses in the analyzer’s handling of certain 
constructs such as unbounded arrays and pointer aliasing. 

The configuration with no invariant generation receives the
second-lowest score of 1557, and (as expected) can verify only
1036 programs successfully, producing more than 800 fewer
correct results than any of the configurations that use sound auxiliary
invariants. This shows that it is indeed important in practice to
enhance k-induction-based software verification with invariants.

The configurations using static invariant generation
produce 1852, 1893, and 1849 correct results and achieve scores
of 3184, 3263, and 3177 points, respectively. These results are
close to each other, but improve upon the results of the plain
k-induction without auxiliary invariants by a score of 1600
to 1700.

This observation explains the high score of 3464 points
achieved by our approach using continuously-refined invariant
generation. By combining the advantages of fast and coarse
precisions with those of slow but fine precisions, it correctly
solves 1981 verification tasks, which is 88 more correct results
than the best of the chosen static configurations. It is thus clearly
the best of all evaluated k-induction-based approaches.

Performance. Table I shows that the fastest configuration in
terms of CPU time is the unsound approach, which is easily
explained by the fact that it often produces incorrect proofs
after analyzing a low number of loop iterations of the program.
Due to the large number of wrong results, the speed of the
approach can hardly be considered a success.

By far the highest amount of time is spent by the
configuration using no auxiliary invariants, because for those
programs that cannot be proved without auxiliary invariants,
the k-induction procedure loops incrementing k until the time
limit is reached. For the sound configurations, the wall times
for the correct results correlate with the number of correct
results, i.e., on average about the same amount of time is spent
on correct verifications, whether or not invariant generation
is used. This shows that the overhead of generating auxiliary
invariants is well-compensated.
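
The claim about the average time per correct result can be checked with a short calculation on the rounded values from Table I. The following Python sketch divides the accumulated wall time for correct results by the number of correct results for each sound configuration; because the table values are rounded to two significant digits, the outcome is only approximate.

```python
# Values copied from Table I (sound configurations only).
correct_results = {
    "none": 1036,
    "(0,1,t)": 1852,
    "(16,2,t)": 1893,
    "(16,2,f)": 1849,
    "cont.-refined": 1981,
}
wall_hours_correct = {
    "none": 5.7,
    "(0,1,t)": 8.4,
    "(16,2,t)": 8.9,
    "(16,2,f)": 7.6,
    "cont.-refined": 9.4,
}

seconds_per_result = {
    name: wall_hours_correct[name] * 3600 / correct_results[name]
    for name in correct_results
}
for name, seconds in seconds_per_result.items():
    print(f"{name:>14}: {seconds:5.1f} s per correct result")
```

All five averages fall between roughly 15 and 20 seconds of wall time per correct result, which is consistent with the statement that about the same amount of time is spent per correct verification.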

The configurations with static and continuously-refined
invariant generation have a relatively higher CPU time
compared to their wall time because these configurations spend
some time generating invariants in parallel to the k-induction
algorithm. The results show, however, that the time spent for
the continuously-refined invariant generation clearly pays off,
as this configuration is not only the one with the most correct
results, but at the same time the fastest sound configuration
with only 170 h in total (20 h less than the second-fastest sound
configuration). The fact that the accumulated wall time (9.4 h)
it spent on correct results is slightly higher than for most of the
other sound configurations is simply because it produced more
correct results. The accumulated CPU time (17 h) spent on
correct results is higher than for most of the other configurations
partly due to the same reason, but also because of the multiple
iterations of the invariant-generation algorithm as opposed to
only one iteration for the configurations using static invariant
generation or even zero iterations for the configuration using
no invariant generation and the unsound configuration using the
Esbmc heuristic. Even though it produced many more correct
results, the configuration using continuous invariant generation
did not exceed the times of the chosen configurations using
static invariant generation (> 170 h).

These results show that the additional effort invested in
generating sound auxiliary invariants is well-spent, as it even
decreases the overall time due to the fewer timeouts. As
expected, the continuously-refined invariants solve many tasks
quicker than the configurations using invariant generation with
high static precisions.

TABLE II: Results of k-induction-based tools for all 2814
verification tasks

Tool                      Cbmc    Esbmc       Esbmc     CPAchecker
Configuration                     sequential  parallel  cont. refined
Score                     -971    1659        2027      3464
Correct results           1216    2214        2137      1981
Wrong proofs              261     184         137       1
Wrong alarms              4       28          24        7
CPU time (h)              350     100         130       170
Wall time (h)             350     100         76        100
Times for correct results only:
  CPU time (h)            1.9     34          25        17
  Wall time (h)           1.9     34          14        9.4
k-values for correct safe results only:
  Max. final k            50      100         100       88
  Avg. final k            1.1     8.4         7.4       2.4

Final value of k. The bottom of Table I shows some statistics
about the final values of k for the correct safety proofs. There
is no difference between the maximum k values for the
configuration using no auxiliary invariants, the configuration (0, 1, t)
using low-precision invariants, and the configuration (16, 2, t)
using medium-precision invariants. The configuration using
static invariant generation with high precision and the unsound
configuration using the Esbmc heuristic have higher maximum
final values of k, with 119 for the high-precision
configuration (16, 2, f) and 120 for the unsound configuration. The logs
revealed that this unique deviation of the high-precision static
invariant-generation configuration was caused by a situation
where the static invariant generation completed only shortly
before the timeout (k = 119 instead of k = 101). For the
unsound configuration, there was one case where, due to
the low overhead of the approach, the iterative deepening
of k progressed quickly up to the value 120, where the
k-induction proof then succeeded. The configuration using
continuously-refined invariants, on the other hand, has a
significantly lower maximum final k-value than the other
configurations. This is due to the following two reasons:
First, with continuously-refined invariants, less time is wasted
on generating unnecessarily strong invariants than for static
high-precision configurations, and the proofs terminate before
high values of k are reached. Second, the dynamic nature of the
approach allows for generating stronger invariants than static
low-precision configurations, thus reducing the value of k
required for the proof to succeed.
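
The effect of an auxiliary invariant on the required value of k can be reproduced on a toy transition system. The following Python sketch performs k-induction by explicit state enumeration instead of SMT solving; the function `k_induction`, the example system, and the invariant are assumptions chosen for illustration and do not reflect the CPAchecker implementation.

```python
def k_induction(init, succ, safe, states, inv=lambda s: True, max_k=10):
    """Return ("unsafe", k), ("safe", k), or ("unknown", max_k)."""
    for k in range(1, max_k + 1):
        # Base case: the first k states of every path from init are safe.
        frontier = set(init)
        for _ in range(k):
            if any(not safe(s) for s in frontier):
                return ("unsafe", k)
            frontier = {t for s in frontier for t in succ[s]}
        # Step case: any k consecutive safe states (that also satisfy the
        # auxiliary invariant) are followed only by safe states.
        ends = {s for s in states if safe(s) and inv(s)}
        for _ in range(k - 1):
            ends = {t for s in ends for t in succ[s] if safe(t) and inv(t)}
        if all(safe(t) or not inv(t) for s in ends for t in succ[s]):
            return ("safe", k)
    return ("unknown", max_k)


# Toy system: reachable cycle 0 -> 1 -> 2 -> 0; the chain 5 -> 6 -> 7 -> 4
# is unreachable, and state 4 is the error state.
succ = {0: [1], 1: [2], 2: [0], 5: [6], 6: [7], 7: [4], 4: [4]}
states = list(succ)


def safe(s):
    return s != 4


print(k_induction({0}, succ, safe, states))                        # plain k-induction
print(k_induction({0}, succ, safe, states, inv=lambda s: s <= 2))  # with invariant
```

Without the invariant, the unreachable chain leading to the error state defeats the step case until k = 4; with the sound invariant `s <= 2`, the proof already succeeds for k = 1.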

Comparison with other tools. For comparison with other
k-induction-based tools, we evaluated Esbmc and Cbmc, two
other successful software model checkers with support for
k-induction. The CPAchecker configuration in this comparison
is the same as the one above using continuously-refined
invariants. For Cbmc, we used the latest version 5.0 in combination
with a wrapper script for split-case k-induction provided by
Michael Tautschnig. For Esbmc, we used the latest version 2.24.1
in combination with the wrapper script of their submission
to the 2013 Competition on Software Verification [29] (the
script configures Esbmc to use k-induction). We also provide
results for the experimental parallel k-induction of Esbmc, but
note that our benchmark setup is not focused on parallelization
(using only two CPU cores and a CPU-time limit instead
of a wall-time limit). Table II summarizes the results; Fig. 3
shows the quantile functions of the accumulated scores for each
configuration. The results for Cbmc are not competitive, which
may be attributed to the experimental nature of its k-induction
support.

Score. Both configurations of Esbmc produce a significant
number of wrong results. All tools do produce some wrong
answers, which are probably related to unsoundness and
imprecision in the handling of some C features. CPAchecker
with k-induction and sound invariants has only 1 missed bug
(i.e., wrong claim of safety), whereas Esbmc, in the sequential
version, has 184 wrong safety proofs. This large number of
wrong results must be attributed to the unsound heuristic of
Esbmc for strengthening the induction hypothesis, where it
retains potentially incorrect information about loop-modified
variables. The large number of wrong proofs reduces the
confidence in the soundness of the correct proofs. Consequently,
the score achieved by CPAchecker with continuously-refined
invariants is much higher than the score of Esbmc (3464 instead
of 2027 points). This clear advantage is also visible in Fig. 3.

Fig. 3: Quantile functions of k-induction-based tools for
accumulated scores showing the CPU time for the successful
results; linear scale between 0 s and 1 s, logarithmic scale
beyond (horizontal axis: accumulated score)

When comparing the results of Esbmc to CPAchecker with
a reimplementation of the unsound heuristic of Esbmc, we
see that Esbmc produces fewer wrong results. The reason
for this difference is that the heuristic only works well if
relevant variables are identified on loop-exit conditions. Due to
CPAchecker's encoding of multiple loops in a program into a
single loop for k-induction, the number of loop-exit conditions
is smaller than in the original program, and the heuristic
performs worse. However, even with the implementation in
Esbmc, this unsound heuristic produces so many wrong results
that it is not suited for verifying program safety.

The parallel version of Esbmc performs somewhat better than 
its sequential version, and misses fewer bugs. This is due to 
the fact that the base case and the step case are performed in 
parallel, and the loop bound k is incremented independently 
for each of them. The base case is usually easier to solve for 
the SMT solver, and thus the base-case checks proceed faster 
than the step-case checks (reaching a higher value of k sooner). 
Therefore, the parallel version manages to find some bugs by 
reaching the relevant k in the base-case checks earlier than 
in the step-case checks, which would produce a wrong safety 
proof at reaching k. However, the number of wrong proofs is 
still much higher than with our approach, which is conceptually 
sound. Thus, our score is more than 1400 points higher. 

Performance. Table II shows that, if only the times for correct
results are considered, our approach is considerably faster than
Esbmc (Cbmc has so few correct results that the time for them is
even less). This indicates that due to our invariants, we succeed
more often with fewer loop unrollings and thus in less time. It
also shows that the effort invested for generating the invariants
is well spent. If considering the total time for the analysis of
all results, CPAchecker needs more time. This is due to the
fact that these measurements are dominated by those programs
for which the tool runs into a timeout, and CPAchecker has
more timeouts, whereas Esbmc has more wrong results (for
which less time is spent). A timeout is generally preferable to
a wrong result, though.

Fig. 4: Scatter plot of the final value of k for all safe programs
verified successfully by both CPAchecker (continuously-refined
invariants) and Esbmc (sequential) with k-induction; the color
of each data point indicates the number of programs solved
with this value of k

Final value of k. The bottom of Table II contains some statistics
on the final value of k that was needed to verify a program.
Figure 4 shows a scatter plot comparing the values of the loop
bound k for CPAchecker with continuously-refined invariants
and Esbmc in its sequential version. Both axes and the color
range have a logarithmic scale. Data points are shown only
for those 1460 verification tasks that can be proved safe by
both configurations. A point in the lower right half means that
CPAchecker needed a lower k (fewer loop unrollings) than
Esbmc for the same verification task. The color of each data
point gives an indication of how many verification tasks are
represented by the data point. For example, the dark point
at (2, 1) signifies that there are 845 programs that can be
verified by CPAchecker with a final value of k = 1, whereas
Esbmc needs k = 2 for these programs.

The table shows that for safe programs, CPAchecker only
needs a loop bound that is (on average) less than a third of the
loop bound that Esbmc needs. The bottom of the plot shows
that there are many programs (including the 845 programs at
(2, 1)) that CPAchecker verifies with only one loop unrolling,
but for which Esbmc needs to unroll the loops more often. To
the right of the plot, there is also a group of programs for which
Esbmc needs a k between 45 and 65 to verify the program,
and CPAchecker succeeds with significantly smaller k. There
are only four programs for which CPAchecker needs a k larger
than 32 (one program each for k = 40, k = 45, k = 50, and
k = 88). For Esbmc, the largest number of loop unrollings is 100,
which is necessary for 71 programs. These advantages are due
to the use of generated invariants, which make the induction
proofs easier and likely to succeed with a smaller value of k.
There is also a group of programs where Esbmc succeeds with
2 loop unrollings but CPAchecker needs up to 16. However,
the number of such programs is relatively small (note that
the data points with a green-to-orange color only represent
1 to 9 programs) and there is only a single program where
CPAchecker unrolls the loops more than 10 times as often as
Esbmc (while there are many with the reverse being true).
The reason why Esbmc needs fewer loop unrollings for some
programs is its (unsound) heuristic of keeping information
about some program variables from the initial program state
in the inductive-step case.

Fig. 5: Quantile functions for accumulated scores showing the
CPU time for the successful results

Comparison with other approaches. We also compare with
the predicate-abstraction implementation of CPAchecker,
which uses the same framework (parser, formula encoding, etc.)
and SMT solver as our implementation of k-induction. The
score-based quantile functions for our k-induction approach
and the existing predicate abstraction in Fig. 5 show that
the latter is somewhat faster and achieves a higher score.
It is surprising that even the well-tuned CPAchecker
implementation of the mature predicate-abstraction approach only
slightly outperforms our novel k-induction implementation.
The difference in performance and score between these two
configurations is much smaller than the improvement of our
approach compared to existing k-induction-based approaches
(cf. Fig. 3). This is a promising result, considering that there is
room for improvement in our approach. Especially the invariant
generation could be further enhanced, e.g., by tailoring the
invariant generation to the special needs of the k-induction
proof, and by a more targeted invariant-refinement procedure.

Acknowledgments. We would like to thank M. Tautschnig and
L. Cordeiro for explaining the optimal available configuration
for k-induction for the verifiers Cbmc and Esbmc, respectively.

V. Conclusion

We have presented the novel idea of combining k-induction
with continuously-refined invariants, and contribute a publicly
available implementation of our idea within the
software-verification framework CPAchecker. Our extensive experiments
show that our approach outperforms all existing
implementations of k-induction for software verification, and that it is
competitive compared to other, more mature techniques for
software verification. We showed that a sound, effective, and
efficient k-induction approach to general-purpose software
verification is possible, and that the additional resources
required to achieve these combined benefits are negligible
if invested judiciously. At the same time, there is still room
for improvement of our technique. In the future, we plan to
integrate successful features of other approaches to k-induction,
such as the parallel algorithm of Esbmc. The experiments
with Esbmc show that we can avoid more timeouts on unsafe
programs by running the iteratively-deepening BMC decoupled
from the slower inductive-step case. We are also interested
in adding an information flow between the two cooperating
algorithms in the reverse direction. If the k-induction procedure
could tell the invariant generation which facts it misses to prove
safety, this could lead to a more efficient and effective approach
that generates invariants that are specifically tailored to the
needs of the k-induction proof. Already now, CPAchecker is
parsimonious in terms of unrollings, compared to other tools.
The low k-values required to prove many programs show that
even our current invariant generation is powerful enough to
produce invariants that are strong enough to help cut down the
necessary number of loop unrollings. Precision refinement
guided by k-induction might direct the invariant generation towards
providing weaker but still useful invariants for k-induction
more efficiently.